105 research outputs found

    Incentivizing Exploration with Heterogeneous Value of Money

    Recently, Frazier et al. proposed a natural model for crowdsourced exploration of different a priori unknown options: a principal is interested in the long-term welfare of a population of agents who arrive one by one in a multi-armed bandit setting. However, each agent is myopic, so in order to incentivize him to explore options with better long-term prospects, the principal must offer the agent money. Frazier et al. showed that a simple class of policies called time-expanded are optimal in the worst case, and characterized their budget-reward tradeoff. The previous work assumed that all agents are equally and uniformly susceptible to financial incentives. In reality, agents may have different utility for money. We therefore extend the model of Frazier et al. to allow agents that have heterogeneous and non-linear utilities for money. The principal is informed of the agent's tradeoff via a signal that could be more or less informative. Our main result is to show that a convex program can be used to derive a signal-dependent time-expanded policy which achieves the best possible Lagrangian reward in the worst case. The worst-case guarantee is matched by so-called "Diamonds in the Rough" instances; the proof that the guarantees match is based on showing that two different convex programs have the same optimal solution for these specific instances. These results also extend to the budgeted case as in Frazier et al. We also show that the optimal policy is monotone with respect to information, i.e., the approximation ratio of the optimal policy improves as the signals become more informative. Comment: WINE 201

    A Tight 2-Approximation for Preemptive Stochastic Scheduling


    Structure Learning in Human Sequential Decision-Making

    Studies of sequential decision-making in humans frequently find suboptimal performance relative to an ideal actor that has perfect knowledge of the model of how rewards and events are generated in the environment. Rather than being suboptimal, we argue that the learning problem humans face is more complex, in that it also involves learning the structure of reward generation in the environment. We formulate the problem of structure learning in sequential decision tasks using Bayesian reinforcement learning, and show that learning the generative model for rewards qualitatively changes the behavior of an optimal learning agent. To test whether people exhibit structure learning, we performed experiments involving a mixture of one-armed and two-armed bandit reward models, where structure learning produces many of the qualitative behaviors deemed suboptimal in previous studies. Our results demonstrate that humans can perform structure learning in a near-optimal manner.

    Methods for specifying the target difference in a randomised controlled trial : the Difference ELicitation in TriAls (DELTA) systematic review

    Peer reviewed. Publisher PD

    Strategies for the Use of Fallback Foods in Apes

    Researchers have suggested that fallback foods (FBFs) shape primate food processing adaptations, whereas preferred foods drive harvesting adaptations, and that the dietary importance of FBFs is central in determining the expression of a variety of traits. We examine these hypotheses in extant apes. First, we compare the nature and dietary importance of FBFs used by each taxon. FBF importance appears greatest in gorillas, followed by chimpanzees and siamangs, and least in orangutans and gibbons (bonobos are difficult to place). Next, we compare 20 traits among taxa to assess whether the relative expression of traits expected for consumption of FBFs matches their observed dietary importance. Trait manifestation generally conforms to predictions based on dietary importance of FBFs. However, some departures from predictions exist, particularly for orangutans, which express relatively more food harvesting and processing traits predicted for consuming large amounts of FBFs than expected based on observed dietary importance. This is probably due to the chemical, mechanical, and phenological properties of the apes’ main FBFs, in particular the high importance of figs for chimpanzees and hylobatids, compared to the use of bark and leaves (plus figs in at least some Sumatran populations) by orangutans. This may have permitted more specialized harvesting adaptations in chimpanzees and hylobatids, and required enhanced processing adaptations in orangutans. Possible intercontinental differences in the availability and quality of preferred foods and FBFs may also be important. Our analysis supports previous hypotheses suggesting a critical influence of the dietary importance and quality of FBFs on ape ecology and, consequently, evolution.

    Safety out of control: dopamine and defence


    Neuroinflammation and psychiatric illness
